44 research outputs found

    Techniques for document image processing in compressed domain

    Full text link
    The main objective of image compression is usually considered to be minimizing storage space. However, as the need to access images frequently increases, it is becoming more important to process the compressed representation directly. In this work, techniques that can be applied directly and efficiently to digital information encoded by a given compression algorithm are investigated. Lossless compression schemes and information processing algorithms for binary document images and text data are two closely related areas, bridged by the fast processing of coded data. The compressed domains addressed in this work, the ITU fax standards and the JBIG standard, are two major schemes used for document compression. Based on ITU Group IV, a modified coding scheme, MG4, which exploits the two-dimensional correlation between scan lines, is developed. From the viewpoints of compression efficiency and the processing flexibility of image operations, the MG4 coding principle and its feature-preserving behavior in the compressed domain are investigated and examined. Two popular coding schemes for bi-level image compression, run-length and Group IV, are studied and compared with MG4 in three aspects: compression complexity, compression ratio, and feasibility of compressed-domain algorithms. In particular, for the operations of connected component extraction, skew detection, and rotation, MG4 shows a significant speed advantage over conventional algorithms. Some useful techniques for processing JBIG-encoded images directly in the compressed domain, or concurrently while they are being decoded, are proposed and generalized.
    In the second part of this work, the possibility of facilitating image processing in the wavelet transform domain is investigated. Textured images can be distinguished from one another by examining their wavelet transforms. The basic idea is that highly textured regions can be segmented using feature vectors extracted from the high-frequency bands, based on the observation that textured images have large energies in both the high and middle frequencies, while images in which the grey level varies smoothly are dominated by the low-frequency channels of the wavelet transform. As a result, a new method is developed and implemented to detect textures and abnormalities in document images using polynomial wavelets. Segmentation experiments indicate that this approach is superior to traditional methods in terms of memory space and processing time.
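The high-band-energy observation above can be illustrated with a minimal sketch. This is not the paper's code: it uses a one-level Haar transform (the simplest wavelet, standing in for the polynomial wavelets of the paper) and synthetic 1-D signals to show that a textured signal concentrates energy in the high-frequency band while a smoothly varying one does not.

```python
# Hypothetical illustration, not the paper's method: one-level Haar
# wavelet transform on synthetic 1-D signals.
import math

def haar_1level(signal):
    """One-level Haar transform: returns (low-pass, high-pass) coefficients."""
    lo, hi = [], []
    for i in range(0, len(signal) - 1, 2):
        a, b = signal[i], signal[i + 1]
        lo.append((a + b) / math.sqrt(2))   # approximation (low frequency)
        hi.append((a - b) / math.sqrt(2))   # detail (high frequency)
    return lo, hi

def high_band_energy(signal):
    _, hi = haar_1level(signal)
    return sum(c * c for c in hi)

# A textured region (rapid oscillation) vs. a smooth grey-level ramp.
textured = [(-1) ** i for i in range(64)]   # alternating +1/-1
smooth = [i / 64.0 for i in range(64)]      # slowly varying grey level

print(high_band_energy(textured) > high_band_energy(smooth))  # True
```

Thresholding (or clustering) such per-region energies is what separates textured regions from smooth background in the segmentation described above.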

    Document Image Analysis Using a new Compression Algorithm

    Full text link
    By properly exploiting the structural characteristics of a compressed document, it is possible to speed up certain image processing operations. Alternatively, one can derive a compression scheme that lends itself to efficient manipulation of documents without compromising the compression factor. Here, a run-based compression technique for binary documents is discussed. The technique, in addition to achieving bit rates comparable to other compression schemes, preserves document features that are useful for the analysis and manipulation of data. Algorithms are proposed to perform vertical run extraction and similar operations in the compressed domain. These algorithms are implemented in software. Experimental results indicate that fast analysis of electronic data is possible if the data are coded according to the proposed scheme.
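The kind of run-based manipulation described above can be sketched as follows. This is an assumption-laden illustration, not the paper's scheme: each scan line is represented by its black runs as (start, end) intervals, and vertical connectivity between adjacent lines is decided by interval overlap, without ever reconstructing a pixel array.

```python
# Hypothetical sketch of run-based processing (not the paper's scheme):
# rows are stored as black runs, and vertical connectivity is tested on
# the runs themselves rather than on decompressed pixels.

def to_runs(row):
    """Encode a binary row (list of 0/1) as black runs (start, end),
    with end exclusive."""
    runs, start = [], None
    for x, px in enumerate(row + [0]):        # sentinel closes a trailing run
        if px and start is None:
            start = x
        elif not px and start is not None:
            runs.append((start, x))
            start = None
    return runs

def connected(run_a, run_b):
    """Two runs on adjacent lines are 4-connected iff their column
    intervals overlap."""
    return run_a[0] < run_b[1] and run_b[0] < run_a[1]

line1 = to_runs([0, 1, 1, 0, 0, 1, 0, 0])    # runs (1, 3) and (5, 6)
line2 = to_runs([0, 0, 1, 1, 0, 0, 0, 1])    # runs (2, 4) and (7, 8)

print([connected(a, b) for a in line1 for b in line2])
# [True, False, False, False]
```

Operations such as vertical run extraction reduce to walks over these interval lists, which is why they can run directly on run-coded data.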

    Effectiveness of Polynomial Wavelets in Text and Image Segmentation

    Full text link
    Wavelet transforms have been widely used as effective tools for texture segmentation over the past decade. Segmentation of document images, which usually contain three types of texture information (text, picture, and background), can be regarded as a special case of texture segmentation. B-spline wavelets possess desirable properties, such as being well localized in time and frequency and being compactly supported, which make them well suited to texture analysis. In this paper, cubic B-spline wavelets are applied to document images; each texture is then characterized by several regional and statistical features estimated at the outputs of the high-frequency bands of the spline wavelet transform. Three-means classification is then applied to group pixels with similar features. We also examine and evaluate the contributions of different factors to the segmentation results, from the viewpoints of decomposition levels, frequency bands, and feature selection.
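The "three-means" step above is k-means clustering with k = 3 (one cluster per texture class). A minimal sketch of Lloyd's algorithm on synthetic 2-D feature vectors follows; the data and starting centres are invented for illustration and are not from the paper.

```python
# Hypothetical sketch (not the paper's implementation): plain k-means with
# k = 3, as used for "three-means" classification of wavelet-band features.

def kmeans(points, centers, iters=20):
    """Lloyd's algorithm: assign each point to its nearest centre, then
    move each centre to the mean of its cluster."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            best = min(range(len(centers)),
                       key=lambda c: sum((a - b) ** 2
                                         for a, b in zip(p, centers[c])))
            clusters[best].append(p)
        centers = [tuple(sum(coord) / len(cl) for coord in zip(*cl))
                   if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers

# Three well-separated groups standing in for text / picture / background
# feature vectors.
pts = [(0.1, 0.0), (0.0, 0.2), (5.0, 5.1), (5.2, 4.9), (9.8, 0.1), (10.0, 0.0)]
centers = kmeans(pts, [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)])
print(sorted(round(c[0]) for c in centers))  # [0, 5, 10]
```

Each pixel is then labelled with the class of its nearest final centre, yielding the text/picture/background segmentation.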

    Document Segmentation Using Polynomial Spline Wavelets

    Full text link
    Wavelet transforms have been widely used as effective tools for texture segmentation over the past decade. Segmentation of document images, which usually contain three types of texture information (text, picture, and background), can be regarded as a special case of texture segmentation. B-spline wavelets possess desirable properties, such as being well localized in time and frequency and being compactly supported, which make them an effective tool for texture analysis. Based on the observation that text textures produce rapidly changing and relatively regularly distributed edges in the wavelet transform domain, an efficient document segmentation algorithm is designed using cubic B-spline wavelets. Three-means or two-means classification is applied to classify pixels with similar characteristics after feature estimation at the outputs of the high-frequency bands of the spline wavelet transform. We examine and evaluate the contributions of different factors to the segmentation results, from the viewpoints of decomposition levels, frequency bands, and wavelet functions. Further performance analysis reveals the advantages of the proposed method.

    Distribution of CFCs and its tracer study in the Chukchi Sea and adjacent areas

    No full text
    During the second Chinese Arctic Scientific Expedition (CHINARE 2003, July to September 2003), chlorofluorocarbons (CFCs) were measured in water samples collected at 26 stations in the Chukchi Sea and adjacent areas. The CFC data indicated that the surface waters had not reached saturation. The distribution of CFCs along the 170°W section and the thermohaline characteristics of the Chukchi Sea confirmed an inflow of Pacific Ocean water into the Arctic Ocean via the Central Channel. There are three main new results. First, there were two sources of freshwater in water shallower than 20 m: the Alaskan Coastal Water (ACW) and sea-ice meltwater. Second, the water mass at station BS09A was ACW, and stations BS06A and BS07A were likely a place of confluence for the ACW and Bering Sea shelf Water (BSW), or for the ACW, BSW, and Anadyr Water (AW) water masses. Third, the ACW flowed almost entirely along the coast, and so had less influence on the more distant offshore stations.

    An Algorithm with Reduced Operations for Connected Components Detection in ITU-T Group 3/4 Coded Images

    Full text link
    An algorithm is presented that performs connected components detection in the course of decoding ITU-T (formerly CCITT) facsimile Group 3/4, i.e., MH/MR/MMR compressed images. New definitions of mode color and a new transition element are introduced, which allow information about the connection of black runs in two adjacent scan lines to be analyzed and derived from MR/MMR codes in the course of decoding. Experiments on the standard set of eight CCITT documents have shown that, on average, the complexity of direct processing of MR/MMR codes is lower by factors of 20 and 2.5 than that of raster-image and MH-code processing, respectively. Data structures for the vector description of images are also discussed.
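The run-level bookkeeping behind such algorithms can be sketched in a few lines. This is a generic two-pass labelling over run-length data with union-find, offered only as an illustration of why run-domain connected components detection avoids touching individual pixels; it is not the MR/MMR-specific algorithm of the paper.

```python
# Hypothetical illustration (not the paper's algorithm): connected
# component counting over run-length data.  Overlapping black runs on
# adjacent scan lines have their labels merged with union-find.

def find(parent, i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]   # path halving
        i = parent[i]
    return i

def count_components(lines):
    """`lines` is a list of scan lines, each a list of (start, end) black
    runs (end exclusive).  Returns the number of 4-connected components."""
    parent = []                          # union-find over run labels
    prev = []                            # (start, end, label) of previous line
    for runs in lines:
        cur = []
        for s, e in runs:
            label = len(parent)
            parent.append(label)         # new provisional label
            for ps, pe, pl in prev:      # merge with overlapping runs above
                if s < pe and ps < e:
                    ra, rb = find(parent, label), find(parent, pl)
                    parent[ra] = rb
            cur.append((s, e, label))
        prev = cur
    return len({find(parent, i) for i in range(len(parent))})

# Two components: an L-shape on the left, an isolated run on the right.
image = [
    [(0, 2), (6, 8)],
    [(0, 1)],
    [(0, 4)],
]
print(count_components(image))  # 2
```

Because each scan line only needs to be compared against the runs of the line above it, the merge step fits naturally into a decoder that emits runs one line at a time.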
